Part 1: Examining the dataset

Part 2: Exploration

Part 3: Research question

-> Are herbivores longer than carnivores on average?

μ1 -> herbivore mean

μ2 -> carnivore mean

H0: μ1 ≤ μ2

Ha: μ1 > μ2

Selected confidence level = 95% (significance level α = 0.05)
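
For reference, a common statistic for this kind of comparison is the two-sample z statistic. The form below is a sketch that assumes large, independent samples; the symbols follow the suffixes of the variable names used in this part (x1, s1, n1, x2, s2, n2, d0), and note that the standard error uses the sample variances s², not the standard deviations.

```latex
z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

For the one-sided alternative Ha: μ1 > μ2 at α = 0.05, H0 is rejected when z exceeds roughly 1.645.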

To count the rows in each diet group:

df['diet'].value_counts()

To obtain the relevant values:

herbivore_mean_x1 = df.loc[df['diet'] == 'herbivorous']['length'].mean()
herbivore_std_dev_s1 = df.loc[df['diet'] == 'herbivorous']['length'].std()
herbivore_count_n1 = df.loc[df['diet'] == 'herbivorous']['length'].count()

carnivore_mean_x2 = df.loc[df['diet'] == 'carnivorous']['length'].mean()
carnivore_std_dev_s2 = df.loc[df['diet'] == 'carnivorous']['length'].std()
carnivore_count_n2 = df.loc[df['diet'] == 'carnivorous']['length'].count()

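import math

# Hypothesized difference between the two means under H0
difference_d0 = 0

numerator = (herbivore_mean_x1 - carnivore_mean_x2) - difference_d0
# The standard error uses the sample variances (s squared), not the standard deviations
denominator = math.sqrt(herbivore_std_dev_s1 ** 2 / herbivore_count_n1 + carnivore_std_dev_s2 ** 2 / carnivore_count_n2)
z = numerator / denominator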
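
As a self-contained check of the calculation, the sketch below wraps the statistic in a function and runs it on made-up summary numbers. The function name two_sample_z and the sample values are illustrative assumptions, not taken from the dataset; the standard error is built from s ** 2 / n (variance over sample size) for each group.

```python
import math


def two_sample_z(mean_1, std_1, n_1, mean_2, std_2, n_2, d0=0.0):
    """Two-sample z statistic for H0: mu1 - mu2 = d0 (hypothetical helper)."""
    # Standard error of the difference: sqrt(s1^2 / n1 + s2^2 / n2)
    standard_error = math.sqrt(std_1 ** 2 / n_1 + std_2 ** 2 / n_2)
    return ((mean_1 - mean_2) - d0) / standard_error


# Made-up summary statistics, for illustration only
z_example = two_sample_z(mean_1=10.0, std_1=2.0, n_1=100,
                         mean_2=9.0, std_2=3.0, n_2=100)
print(round(z_example, 4))
```

Passing the six values computed from df (means, standard deviations, counts) into such a function should give the same z as the step-by-step version.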

Since P(Z > 12.38) < 0.05, we reject the null hypothesis H0.
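
The tail probability can be computed rather than read from a table. The sketch below is one way to do it with only the standard library, using the identity P(Z > z) = 0.5 * erfc(z / sqrt(2)); the function name upper_tail_p_value is an assumption introduced for illustration.

```python
import math


def upper_tail_p_value(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


alpha = 0.05         # significance level matching the 95% confidence level
z_observed = 12.38   # z value reported in the text above
p_value = upper_tail_p_value(z_observed)
reject_null = p_value < alpha
print(p_value, reject_null)
```

If SciPy is available, scipy.stats.norm.sf(z) returns the same upper-tail probability.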

The data therefore support the claim that herbivores are, on average, longer than carnivores.

Part 4: For fun

# Keep only the four characters before the last one in each entry (vectorized, no loop or chained assignment)
df['named_by'] = df['named_by'].str[-5:-1]
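
Slicing by position only works if every entry ends the same way. A more defensive sketch is shown below, assuming (this is a guess about the data, not something stated above) that entries end with a four-digit year in parentheses, as in the hypothetical string 'Smith (1901)'. The helper name extract_trailing_year and the example strings are made up for illustration.

```python
import re


def extract_trailing_year(text):
    """Return a four-digit year found in trailing parentheses, or None."""
    match = re.search(r"\((\d{4})\)\s*$", text)
    return match.group(1) if match else None


# Hypothetical values, not taken from the dataset
examples = ["Smith (1901)", "Unknown"]
years = [extract_trailing_year(s) for s in examples]
print(years)
```

Entries that do not match come back as None instead of a wrongly sliced fragment, which makes malformed rows easy to spot.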